Extracting dynamic parameters from single molecule manipulation experiments usually requires recording a large amount of data at different forces. However, the measuring time for a single molecule is limited by breakage of the tether or degradation of the molecule. Here we propose a data analysis method, based on maximizing the probability of the recorded time trace, that extracts the dynamic parameters from a single measurement. The feasibility of the method was verified on simulated data of a two-state system. We also applied the method to estimate the parameters of DNA hairpin folding and unfolding dynamics measured in a magnetic tweezers experiment.
The single molecule manipulation method has been widely used to study the conformational dynamics of biomolecules, such as DNA unzipping and protein folding/unfolding.[1,2] Important kinetic parameters can be obtained from single molecule experiments. In a single molecule manipulation experiment, the molecule is stretched by force and the extension is measured with nanometer resolution. From the measurement of force and extension, conformational transition of the molecules can be detected.[3]
If a constant force is applied to stretch the molecule, the lifetime of a specific conformation can be obtained.[4] If the molecule is stretched at a constant loading rate (rate of force increase), or by moving an optical trap away at constant speed, the transition force can be obtained.[5,6] Distributions of lifetimes or transition forces are usually used to extract the dynamic parameters of the conformational transitions of the molecule.
For some theoretical models, an analytical expression for the distribution can be derived, and the dynamic parameters are then readily obtained by fitting this expression to the measured distribution.[7] If no analytical expression is available, a model-dependent simulation is compared with the experimental results,[8,9] usually by adjusting the parameters by hand until the difference looks minimal. Both methods require distributions of unfolding force, lifetime, etc. Because the conformational transition dynamics of a single molecule is a stochastic process, a large number of measurements is needed to obtain an accurate distribution. With limited experimental data, an accurate distribution cannot be obtained, and even rough estimates of the kinetic parameters are difficult to make.
Here we propose a systematic algorithm that obtains optimized parameters directly from the measured time trace. We assume that, among all possible time traces, the measured one has the maximal probability of appearing. In the algorithm, the kinetic parameters are optimized to maximize the probability of the measured time trace, and the optimized values serve as estimates of the real values. We use the widely used Bell’s model and dynamic simulation of two-state transitions to demonstrate the algorithm. We then apply the algorithm to a single molecule manipulation experiment on a DNA hairpin.
The free energy landscape of the DNA hairpin (Fig.
If the force changes, the force-dependent transition rate
Based on Bell’s model, force-dependent unfolding rate
In each cycle of the constant loading rate simulation, the force was increased from 6 pN to 13 pN at a loading rate of 0.005 pN/s, 0.007 pN/s, 0.01 pN/s, 0.02 pN/s, or 0.05 pN/s, and then decreased from 13 pN to 6 pN at a loading rate of −0.005 pN/s, −0.007 pN/s, −0.01 pN/s, −0.02 pN/s, or −0.05 pN/s, respectively.
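As an illustration of a force-ramp simulation of this kind, the following Python sketch generates a two-state time trace under a linear force ramp. It is only a minimal sketch under stated assumptions: Bell-type rates of the form k_u(f) = k_u0 exp(f Δx_u / k_BT) and k_f(f) = k_f0 exp(−f Δx_f / k_BT), k_BT = 4.1 pN·nm, and a fixed time step dt small enough that k·dt ≪ 1. The function names and the numerical parameter values are hypothetical and are not the ones used in this work.

```python
import math
import random

KBT = 4.1  # thermal energy in pN*nm (assumed, near room temperature)

# Hypothetical Bell parameters: (ku0 [1/s], dxu [nm], kf0 [1/s], dxf [nm]).
# They are chosen so that both rates are about 1/s near 9.5 pN.
EXAMPLE_PARAMS = (0.0097, 2.0, 103.0, 2.0)


def bell_rates(force, ku0, dxu, kf0, dxf):
    """Bell-type rates (1/s): unfolding speeds up with force, folding slows down."""
    ku = ku0 * math.exp(force * dxu / KBT)
    kf = kf0 * math.exp(-force * dxf / KBT)
    return ku, kf


def simulate_ramp(f_start, f_end, loading_rate, dt, params, state=0, rng=None):
    """Sample a two-state trace every dt during a linear force ramp.

    Returns (forces, states) with states coded as 0 = folded, 1 = unfolded.
    Only the magnitude of loading_rate is used; the ramp direction follows
    from f_start and f_end.
    """
    rng = rng or random.Random()
    speed = abs(loading_rate)
    n_steps = int(round(abs(f_end - f_start) / speed / dt))
    sign = 1.0 if f_end >= f_start else -1.0
    forces, states = [], []
    for i in range(n_steps + 1):
        force = f_start + sign * speed * dt * i
        forces.append(force)
        states.append(state)
        ku, kf = bell_rates(force, *params)
        k = ku if state == 0 else kf
        # First-order switching probability; valid only while k*dt << 1.
        if rng.random() < k * dt:
            state = 1 - state
    return forces, states


if __name__ == "__main__":
    up_f, up_s = simulate_ramp(6.0, 13.0, 0.05, 0.01, EXAMPLE_PARAMS,
                               state=0, rng=random.Random(0))
    n_transitions = sum(a != b for a, b in zip(up_s, up_s[1:]))
    print("samples:", len(up_s), "transitions:", n_transitions)
```

A force-decreasing half cycle can be produced the same way by swapping the start and end forces and starting from the unfolded state, for example `simulate_ramp(13.0, 6.0, -0.05, 0.01, EXAMPLE_PARAMS, state=1)`.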
The time trace of states comes from a stochastic process. Therefore, each specific time trace has a certain probability to appear. The state transition process is a Markov process. Therefore, the probability of a specific time trace of states S1, S2, …, SN is given by
The nearest neighboring state
We assumed that the recorded time trace has the maximal probability among all possible time traces. The probability (Eq.
A small DNA hairpin with double-stranded DNA stem shorter than 30 base pairs and single-stranded DNA loop can be modeled as a two-state system: one folded DNA hairpin state stable at low stretching force, and one unfolded ssDNA state stable at high stretching force.[1] At critical force
We used a previously reported method to build the DNA hairpin construct.[7,11] A DNA hairpin with sequence of
Home-made magnetic tweezers were used to apply force to the DNA hairpin construct. The magnetic tweezers controlled the stretching force by changing the distance between the permanent magnets and the sample.[12–14] Overstretching transition of DNA handles at
In the magnetic tweezers experiment, after a single DNA hairpin tether was confirmed by the overstretching signal at
From such a quick measurement, it is difficult to obtain accurate values of the dynamic parameters related to the force-dependent transition rates, such as
To test the validity of the proposed probability optimization algorithm, we first applied it to the case of a simple two-state transition.
At constant force, the measured time trace of the DNA hairpin comes from a simple two-state model with constant transition rates of
Here we estimated the standard deviation of the results by multiple independent simulations. 100 independent time traces were obtained with the same simulation time. From each time trace, a set of parameters
The number of recorded transitions is proportional to the simulation time. The average number of transitions was also calculated. Figure
The probability optimization method gave results similar to the traditional method (Figs.
Figures
If the force varies during the measurement, the transition rates also change with time, so the traditional mean dwell time method does not work. The probability optimization algorithm, however, can still be applied if a proper model is used to describe the force-dependent transition rates.
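A minimal Python sketch of how such a probability maximization could be set up for a sampled two-state trace under varying force is given below. It is an illustration under stated assumptions, not the implementation used in this work: the rates are taken to follow the Bell form k_u(f) = k_u0 exp(f Δx_u / k_BT) and k_f(f) = k_f0 exp(−f Δx_f / k_BT) with k_BT = 4.1 pN·nm, the rates are treated as constant within each sampling interval dt, and the optimizer is a crude coordinate pattern search chosen only to keep the example dependency-free. All function names are hypothetical.

```python
import math

KBT = 4.1  # thermal energy in pN*nm (assumed, near room temperature)


def log_likelihood(forces, states, dt, ku0, dxu, kf0, dxf):
    """Log-probability of a sampled two-state trace (0 = folded, 1 = unfolded).

    Within each interval dt the rate k is treated as constant, so the
    probability of staying is exp(-k*dt) and of switching is 1 - exp(-k*dt).
    """
    total = 0.0
    for i in range(len(states) - 1):
        if states[i] == 0:
            k = ku0 * math.exp(forces[i] * dxu / KBT)
        else:
            k = kf0 * math.exp(-forces[i] * dxf / KBT)
        if states[i + 1] == states[i]:
            total += -k * dt
        else:
            p_switch = max(-math.expm1(-k * dt), 1e-300)  # floor keeps log finite
            total += math.log(p_switch)
    return total


def maximize(func, start, step=0.5, tol=1e-3):
    """Crude coordinate pattern search; returns (best_parameters, best_value)."""
    best = list(start)
    best_val = func(best)
    while step > tol:
        improved = False
        for j in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[j] += delta
                val = func(trial)
                if val > best_val:
                    best, best_val, improved = trial, val, True
        if not improved:
            step /= 2.0  # refine the search once no neighbor is better
    return best, best_val


def fit_bell(forces, states, dt, start):
    """Fit theta = [ln ku0, dxu, ln kf0, dxf]; log-rates keep the rates positive."""
    def objective(theta):
        return log_likelihood(forces, states, dt,
                              math.exp(theta[0]), theta[1],
                              math.exp(theta[2]), theta[3])
    return maximize(objective, start)
```

Because the zero-force rate and the transition distance are strongly correlated in Bell-type models, a coordinate search like this may converge slowly on real data; a general-purpose optimizer would likely be a better choice in practice.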
In a small force range, the transition rates usually follow Bell’s model (Eqs. (
We optimized the parameters from the initial values of
We applied our algorithm to experimental data of DNA hairpin folding and unfolding dynamics under variable forces which is difficult to analyze reliably. We conducted a set of experiments with eight cycles (Fig.
In the extension time trace, the state of the DNA hairpin at each time point was determined from the extension and sharp folding and unfolding transitions. By applying the optimization algorithm on the experimental data, the results of
We have proposed a probability optimization algorithm that greatly improves the efficiency of experimental data analysis, especially for limited experimental data from which a smooth distribution cannot be obtained. The algorithm was verified by simulation of two-state transitions and applied to an experimental measurement of DNA hairpin unzipping. For the simulation data under both the constant force condition and the force-loading cycle condition, reasonable results were obtained with very limited data. The results converged to the preset parameter values as the number of recorded transitions increased. For the real experiment on the DNA hairpin, the kinetic parameters were obtained together with an estimate of their accuracy.
The idea of maximum probability comes from thermodynamic principles. It is rigorous for a long time trace with an infinite number of transitions, but even for a short time trace with a limited number of transitions it already gives good estimates of the kinetic parameters. For the same system, the probabilities of multiple time traces can be combined by multiplication, so all measurements can be used to estimate the kinetic parameters. The results gradually converge to the true values as more experimental data are included.
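The combination of several time traces can be sketched in Python as follows. Multiplying the probabilities of independent traces is equivalent to summing their log-probabilities. For the special case of constant rates, and assuming a trace sampled at a fixed interval dt with switching probability 1 − exp(−k·dt) per interval, the maximum of the combined probability has a closed form that depends only on the pooled counts of intervals and transitions. The function names are hypothetical, and this is an illustration rather than the procedure used in this work.

```python
import math


def log_prob(states, dt, ku, kf):
    """Log-probability of one sampled trace (0 = folded, 1 = unfolded)."""
    total = 0.0
    for now, nxt in zip(states, states[1:]):
        k = ku if now == 0 else kf
        if nxt == now:
            total += -k * dt                          # stayed during dt
        else:
            total += math.log(-math.expm1(-k * dt))   # switched during dt
    return total


def count_trace(states):
    """Per-state number of sampled intervals and of intervals ending in a switch."""
    steps = {0: 0, 1: 0}
    switches = {0: 0, 1: 0}
    for now, nxt in zip(states, states[1:]):
        steps[now] += 1
        if nxt != now:
            switches[now] += 1
    return steps, switches


def pooled_rates(traces, dt):
    """Constant rates (ku, kf) maximizing the product of all trace probabilities."""
    steps = {0: 0, 1: 0}
    switches = {0: 0, 1: 0}
    for states in traces:
        s, w = count_trace(states)
        for key in (0, 1):
            steps[key] += s[key]
            switches[key] += w[key]
    rates = []
    for key in (0, 1):
        if steps[key] == 0:
            raise ValueError("no samples recorded in state %d" % key)
        p = switches[key] / steps[key]   # fraction of intervals ending in a switch
        rates.append(math.inf if p >= 1.0 else -math.log1p(-p) / dt)
    return tuple(rates)
```

For example, `pooled_rates([[0, 0, 1], [0, 1, 1, 0]], 1.0)` pools three folded-state intervals (two ending in a transition) and two unfolded-state intervals (one ending in a transition) from the two traces before estimating the rates.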
In magnetic tweezers, the force is controlled while the extension is measured.[13,17] In optical tweezers or atomic force microscopy experiments, the molecule is pulled by changing the position of the optical trap or the distance between the sample and the cantilever, while the force is measured at the same time.[6,8] In both cases, the maximum probability algorithm can be applied to determine the model parameters if a proper model is used to describe the force-dependent state transition rates.
The simple mean dwell time method can only be applied to experiments with fixed transition rates, such as the constant force experiment. In contrast, the probability optimization method can be used to analyze experimental data recorded under any conditions, such as constant force, constant loading rate, and constant pulling speed. Compared with the Monte Carlo simulation and unfolding force histogram comparison method, the probability optimization method does not require a large amount of data to obtain a smooth distribution. Both the Monte Carlo simulation method and the probability optimization method rely on a correct model to describe the force-dependent transition rates. In this paper, Bell’s model is used to demonstrate the application. Because the probability optimization method is model-dependent, it might give a meaningless result when an improper model is used. In this case, a series of constant force measurements is the most direct way to obtain the force-dependent transition rate.[4] Dudko’s model-independent method also works well if there are enough experimental data to give a smooth force distribution over the force range of interest.[18]
[1] | |
[2] | |
[3] | |
[4] | |
[5] | |
[6] | |
[7] | |
[8] | |
[9] | |
[10] | |
[11] | |
[12] | |
[13] | |
[14] | |
[15] | |
[16] | |
[17] | |
[18] |